Automated Noise Detection in a Database Based on a Combined Method
Abstract
Data quality has diverse dimensions, of which accuracy is the most important. Data cleaning, one of the preprocessing steps in data mining, consists of detecting errors and repairing them. Noise is a common type of error that occurs in databases. This paper proposes an automated method based on k-means clustering for noise detection. First, each attribute (Aj) is temporarily removed and k-means clustering is applied to the other attributes. Thereafter, k-nearest neighbors is used within each cluster. A value is then predicted for Aj in each record from its nearest neighbors. The proposed method detects noisy attributes using the predicted values. The method is able to identify several noises in a single record. In addition, it can detect noise in fields of different types. Experiments show that the method detects, on average, 92% of the noise existing in the data. The proposed method is compared with a noise detection method based on association rules. The results indicate that the proposed method improves noise detection by 13%.
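The steps named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it handles numeric attributes only, uses a deterministic farthest-first initialisation for k-means, predicts Aj as the median of the neighbors' values, and flags a value when it differs from the prediction by more than a fixed tolerance. The names `kmeans_labels`, `detect_noise`, and `tol`, and the sample `records`, are all hypothetical.

```python
import math
from statistics import median


def kmeans_labels(points, k, iters=20):
    """Plain k-means; returns a cluster label per point."""
    # Assumption: farthest-first initialisation, chosen only for reproducibility.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center, then recompute the centers.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def detect_noise(records, k_clusters=2, k_neighbors=3, tol=1.0):
    """Return (record_index, attribute_index) pairs suspected to be noisy."""
    flagged = []
    for j in range(len(records[0])):
        # Temporarily remove attribute Aj and cluster on the other attributes.
        rest = [[v for a, v in enumerate(r) if a != j] for r in records]
        labels = kmeans_labels(rest, k_clusters)
        for i, r in enumerate(records):
            # Candidate neighbors: other records in the same cluster.
            peers = [
                m for m in range(len(records))
                if m != i and labels[m] == labels[i]
            ]
            if not peers:
                continue
            peers.sort(key=lambda m: math.dist(rest[i], rest[m]))
            # Assumption: the prediction is the median of the neighbors' Aj.
            predicted = median(records[m][j] for m in peers[:k_neighbors])
            if abs(r[j] - predicted) > tol:
                flagged.append((i, j))
    return flagged


# Hypothetical data: two well-separated groups, one corrupted value (record 4, attribute 2).
records = [
    [1.0, 1.0, 1.0], [1.1, 1.0, 0.9], [0.9, 1.1, 1.0], [1.0, 0.9, 1.1],
    [1.0, 1.0, 9.0],
    [20.0, 20.0, 20.0], [20.1, 20.0, 19.9], [19.9, 20.1, 20.0], [20.0, 19.9, 20.1],
]
noisy = detect_noise(records)
```

The median is used here instead of a mean so that a single corrupted neighbor does not distort the predictions for clean records. Because every attribute is checked in turn, the same record can be flagged more than once, which matches the abstract's claim that several noises can be identified in one record.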
Similar resources
A Novel Noise Reduction Method Based on Subspace Division
This article presents a new subspace-based technique for reducing the noise of signals in time-series. In the proposed approach, the signal is initially represented as a data matrix. Then using Singular Value Decomposition (SVD), noisy data matrix is divided into signal subspace and noise subspace. In this subspace division, each derivative of the singular values with respect to rank order is u...
Task-Based Language Teaching in Iran: A Mixed Study through Constructing and Validating a New Questionnaire Based on Theoretical, Sociocultural, and Educational Frameworks
Various aspects of life in Iran, including lifestyle, science, and technical and technological facilities, can be regarded as more or less imported. The English language and its teaching methods are no exception to this rule. Nevertheless, the question sometimes arises whether a particular method is compatible with the theoretical, sociocultural, and educational infrastructure of Iranian society. This study was conducted using mixed methods. A questionnaire for language learners was also ...
A Study on Rate Making and Required Reserves Determination in Reinsurance Market: A Simulation
Reinsurance is widely recognized as an important instrument in the capital management of an insurance company as well as its risk management tool. This thesis is intended to determine premium rates for different types of reinsurance policies. Also, given the fact that the reinsurance coverage of every company depends upon its reserves, so different types of reserves and the method of their calc...
A Trust Based Probabilistic Method for Efficient Correctness Verification in Database Outsourcing
Correctness verification of query results is a significant challenge in database outsourcing. Most of the proposed approaches impose high overhead, which makes them impractical in real scenarios. Probabilistic approaches are proposed in order to reduce the computation overhead pertaining to the verification process. In this paper, we use the notion of trust as the basis of our probabilistic app...
Journal
Journal title: Statistics, Optimization and Information Computing
Year: 2021
ISSN: 2310-5070, 2311-004X
DOI: https://doi.org/10.19139/soic-2310-5070-879